Checkpointing algorithms and fault prediction



Similar resources

Checkpointing algorithms and fault prediction

This paper deals with the impact of fault prediction techniques on checkpointing strategies. We extend the classical first-order analysis of Young and Daly in the presence of a fault prediction system, characterized by its recall and its precision. In this framework, we provide an optimal algorithm to decide when to take predictions into account, and we derive the optimal value of the checkpoin...


Impact of fault prediction on checkpointing strategies

This paper deals with the impact of fault prediction techniques on checkpointing strategies. We extend the classical analysis of Young in the presence of a fault prediction system, which is characterized by its recall and its precision, and which provides either exact or window-based time predictions. We succeed in deriving the optimal value of the checkpointing period (thereby minimizing the wa...


Fault Tolerance and Checkpointing

Research on and applications of clusters of workstations are growing rapidly. One of the major areas is fault tolerance. This report describes two related concerns: correctness and performance. After a number of techniques to improve performance are described, two new research directions, diskless checkpointing and Java checkpointing, are introduced.


Checkpointing and Rollback Recovery Algorithms for Fault Tolerance in MANETs: A Review

Sushant Patial and Jawahar Thakur, Department of Computer Science, Himachal Pradesh University, Shimla-5. Email: patialsushant@gmail.com, jawahar.hpu@gmail.com. Abstract: ...


Fault-tolerant finite-element multigrid algorithms with hierarchically compressed asynchronous checkpointing

We examine novel fault tolerance schemes for data loss in multigrid solvers which essentially combine ideas of checkpoint-restart with algorithm-based fault tolerance. To improve efficiency compared to conventional global checkpointing, we exploit the inherent data compression of the multigrid hierarchy, and relax the synchronicity requirement through a local failure local recovery approach. We...



Journal

Journal title: Journal of Parallel and Distributed Computing

Year: 2014

ISSN: 0743-7315

DOI: 10.1016/j.jpdc.2013.10.010